Overlay: Use overlay-aware CLI version when analyzing PRs by henrymercer · Pull Request #3880 · github/codeql-action

henrymercer · 2026-05-06T18:18:36Z

When analyzing PRs, prefer CLI versions that have cached overlay-base databases to speed up analysis. However, to ensure we can effectively roll back new versions, do not use a CodeQL version whose feature flag is disabled, even if this means running without overlay analysis.

This will be shipped via two feature flags:

overlay_analysis_match_codeql_version_dry_run logs a diagnostic when the overlay-aware version differs from the latest enabled version
overlay_analysis_match_codeql_version uses the overlay-aware version when analysing PRs
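
A minimal sketch of how the two flags could interact, assuming the enabled versions arrive sorted newest first and the overlay-base cache lookup has already been reduced to a set of CLI versions. The function `resolveDefaultVersion`, the `OverlayMatchFlags` type, and the log message are hypothetical and are not taken from the PR's code; only the `cliVersion` and `tagName` field names appear in the diff excerpts.

```typescript
interface CodeQLVersionInfo {
  cliVersion: string;
  tagName: string;
}

// Hypothetical holder for the state of the two feature flags described above.
interface OverlayMatchFlags {
  matchEnabled: boolean; // overlay_analysis_match_codeql_version
  dryRunEnabled: boolean; // overlay_analysis_match_codeql_version_dry_run
}

/**
 * Picks a default CLI version for a PR analysis.
 *
 * `enabledVersions` is assumed to be sorted newest first and to contain only
 * versions whose feature flag is enabled, so a rolled-back version can never
 * be selected even if it has a cached overlay-base database.
 */
function resolveDefaultVersion(
  enabledVersions: CodeQLVersionInfo[],
  cachedOverlayVersions: Set<string>,
  flags: OverlayMatchFlags,
  log: (message: string) => void,
): CodeQLVersionInfo {
  const latest = enabledVersions[0];
  if (latest === undefined) {
    throw new Error("No enabled CodeQL CLI versions");
  }
  if (!flags.matchEnabled && !flags.dryRunEnabled) {
    return latest;
  }
  // Highest enabled version that also has a cached overlay-base database.
  const overlayAware = enabledVersions.find((v) =>
    cachedOverlayVersions.has(v.cliVersion),
  );
  if (overlayAware === undefined || overlayAware.cliVersion === latest.cliVersion) {
    return latest;
  }
  if (flags.matchEnabled) {
    return overlayAware;
  }
  // Dry run: report the difference but keep the existing behaviour.
  log(
    `Overlay-aware version ${overlayAware.cliVersion} differs from latest enabled version ${latest.cliVersion}`,
  );
  return latest;
}
```

In this sketch the dry-run flag changes nothing about which version is installed, which is what allows the lookup to be dark shipped before the selection itself is turned on.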

Risk assessment

For internal use only. Please select the risk level of this change:

Low risk: Changes are fully under feature flags, or have been fully tested and validated in pre-production environments and are highly observable, or are documentation or test only.

Which use cases does this change impact?

Workflow types:

Advanced setup - Impacts users who have custom CodeQL workflows.
Managed - Impacts users with dynamic workflows (Default Setup, Code Quality, ...).

Products:

Code Scanning - The changes impact analyses when analysis-kinds: code-scanning.
Code Quality - The changes impact analyses when analysis-kinds: code-quality.
Other first-party - The changes impact other first-party analyses.

Environments:

Dotcom - Impacts CodeQL workflows on github.com and/or GitHub Enterprise Cloud with Data Residency.

How did/will you validate this change?

Unit tests - I am depending on unit test coverage (i.e. tests in .test.ts files).

If something goes wrong after this change is released, what are the mitigation and rollback strategies?

Feature flags - All new or changed code paths can be fully disabled with corresponding feature flags.

How will you know if something goes wrong after this change is released?

Telemetry - I rely on existing telemetry or have made changes to the telemetry.
Dashboards - I will watch relevant dashboards for issues after the release. Consider whether this requires this change to be released at a particular time rather than as part of a regular release.
Alerts - New or existing monitors will trip if something goes wrong with this change.

Are there any special considerations for merging or releasing this change?

No special considerations - This change can be merged at any time.

Merge / deployment checklist

Confirm this change is backwards compatible with existing workflows.
Consider adding a changelog entry for this change.
Confirm the readme and docs have been updated if necessary.

Copilot

Pull request overview

This PR updates how the Action selects the default CodeQL CLI version so that, when analyzing pull requests, it can prefer an enabled CLI version that already has cached overlay-base databases for the configured languages (to speed up overlay/incremental analysis), while still respecting feature-flag rollback constraints.

Changes:

Extend the default CLI “version info” returned from feature flags to include a sorted list of enabled default versions (not just a single version).
Add PR-aware logic in setup-codeql to optionally pick the highest enabled version that intersects with overlay-base DB cache entries (with dry-run telemetry support).
Thread the new version-info shape through callers and update unit tests and the changelog entry.
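
As a rough illustration of the new version-info shape, the sketch below builds a `CodeQLDefaultVersionInfo` with an `enabledVersions` list from a set of enabled CLI version strings. The field names `enabledVersions`, `cliVersion`, and `tagName` appear in the diff excerpts on this page; the helper names, the newest-first ordering, and the `codeql-bundle-v` tag prefix are assumptions made for the example.

```typescript
interface CodeQLVersionInfo {
  cliVersion: string;
  tagName: string;
}

interface CodeQLDefaultVersionInfo {
  enabledVersions: CodeQLVersionInfo[];
}

// Hypothetical comparator: orders dotted numeric versions newest first.
function compareVersionsDescending(a: string, b: string): number {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    const diff = (partsB[i] ?? 0) - (partsA[i] ?? 0);
    if (diff !== 0) {
      return diff;
    }
  }
  return 0;
}

// Hypothetical helper: deduplicates, sorts, and attaches an assumed tag name.
function toDefaultVersionInfo(enabledCliVersions: string[]): CodeQLDefaultVersionInfo {
  const enabledVersions = [...new Set(enabledCliVersions)]
    .sort(compareVersionsDescending)
    .map((cliVersion) => ({
      cliVersion,
      tagName: `codeql-bundle-v${cliVersion}`, // assumed tag format
    }));
  return { enabledVersions };
}
```

With this ordering, a caller that only needs a single version, such as the proxy download path, can take `enabledVersions[0]`.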

Show a summary per file

File	Description
src/upload-lib.ts	Switches to the new `getEnabledDefaultCliVersions` API and passes `rawLanguages` through `initCodeQL`.
src/testing-utils.ts	Updates test fixtures/mocks for the new `CodeQLDefaultVersionInfo.enabledVersions` shape.
src/start-proxy.ts	Adapts to the new version-info shape by selecting `enabledVersions[0]` for proxy downloads.
src/start-proxy.test.ts	Updates stubbing to `getEnabledDefaultCliVersions`.
src/setup-codeql.ts	Implements overlay-aware default-version resolution for PR analyses (feature-flag gated) and threads `rawLanguages`.
src/setup-codeql.test.ts	Updates call sites for new `rawLanguages` parameter and adds unit tests for overlay-cache version filtering.
src/setup-codeql-action.ts	Updates to `getEnabledDefaultCliVersions` and passes `rawLanguages` (currently `undefined`).
src/init.ts	Threads `rawLanguages` through to `setupCodeQL`.
src/init-action.ts	Uses `getEnabledDefaultCliVersions` and passes `rawLanguages` derived from the `languages` input.
src/feature-flags.ts	Introduces `CodeQLVersionInfo`, changes default version info to `enabledVersions[]`, and adds new overlay match feature flags.
src/feature-flags.test.ts	Updates tests to validate multi-version enablement ordering/fallback behavior.
src/codeql.ts	Threads `rawLanguages` into tool setup to support PR-aware version resolution.
src/codeql.test.ts	Updates tests for the new default-version info shape and new function signatures.
CHANGELOG.md	Adds an UNRELEASED entry describing the experimental overlay-aware default version selection.
lib/upload-sarif-action-post.js	Generated JS output (not reviewed).
lib/upload-lib.js	Generated JS output (not reviewed).
lib/start-proxy-action.js	Generated JS output (not reviewed).
lib/start-proxy-action-post.js	Generated JS output (not reviewed).
lib/resolve-environment-action.js	Generated JS output (not reviewed).
lib/init-action.js	Generated JS output (not reviewed).
lib/autobuild-action.js	Generated JS output (not reviewed).
lib/analyze-action.js	Generated JS output (not reviewed).
lib/analyze-action-post.js	Generated JS output (not reviewed).

Copilot's findings

Comments suppressed due to low confidence (1)

src/start-proxy.test.ts:1029

The stub variable is named getDefaultCliVersion, but it actually stubs getEnabledDefaultCliVersions. Renaming the local variable (and the later assertion) would avoid confusion and better reflect what’s being tested.

      const getDefaultCliVersion = sinon
        .stub(features, "getEnabledDefaultCliVersions")
        .resolves({
          enabledVersions: [{ cliVersion: "2.20.1", tagName: expectedTag }],
        });
      const path = await startProxyExports.getProxyBinaryPath(logger, features);

      t.assert(getDefaultCliVersion.calledOnce);
      sinon.assert.calledOnceWithMatch(

Files reviewed: 14/26 changed files
Comments generated: 3

+      const version = await resolveDefaultCliVersion(
+        defaultCliVersion,
+        rawLanguages,
+        features,
+        logger,
+      );
+      cliVersion = version.cliVersion;
+      tagName = version.tagName;
    }


mbg

Thanks for breaking this up into (reasonably) sane commits and a separate PR from the previous one! Overall, this looks like a pretty good first stab at this change and I don't see any major issues here.

I have left a bunch of detailed comments. There are a couple in particular about possible issues down the road that we might want to consider documenting or guarding against.

I am wondering if the second FF is a bit overkill and adds an extra layer of complexity on top of an already fairly complex change. I don't feel strongly about this, but the thought crossed my mind whether a reasonable thing to do here to reduce complexity / risk would be to consider breaking this PR up into three separate changes:

The first commit that modifies some of the existing logic to return all FF-enabled CLI versions, but doesn't make use of this information. We can ship that change without a new FF.
Then ship the new logic that's guarded by the second FF. I.e. everything except actually using the CLI version that we determined using the new process.
Then ship the first FF + the logic that uses the CLI version determined using the new process.

mbg · 2026-05-07T15:00:25Z

    // version, or the one enabled by FFs.
    const versionInfo = useFeaturesToDetermineCLI
-      ? await getCliVersionFromFeatures(features)
+      ? (await getCliVersionFromFeatures(features)).enabledVersions[0]


This is an interesting change. It might mean that the bundle release used by start-proxy and init could diverge. That's certainly not a problem right now, since we don't depend on them lining up. (I.e. we have thus far not had any reason to require the proxy to be of at least a particular version for changes made in the CLI/extractors.)

It could also mean that a base database is extracted using the version of the proxy that shipped with the corresponding CLI, and we then end up using a newer proxy for the overlay analysis on a PR. I don't see an immediate problem with it, but it is worth noting.

We might want to think about whether this is something we want to document somehow so that it doesn't become a problem if we ever want to make that assumption.

I was originally thinking we'd do this in a separate PR, but let's just add the languages and analysis-kinds inputs to avoid confusion here.

That solution is OK, but a concern here is that it takes the setup-codeql action a step away from an action that just handles installing the CLI in the sense that it now needs to be aware of what is going to be analysed (if anything). I think the changes are OK though because:

The languages and analysis-kinds inputs are optional (ish - analysis-kinds has a default).

We are still just installing the CLI, but for overlay-enabled Code Scanning analyses the new logic is unavoidable and that requires the information about the (expected) analysis.

Indeed, since we're now basing our choice of CLI on the languages being analysed when doing a code scanning analysis, I think this is a necessity.

mbg · 2026-05-07T15:04:23Z

      getTemporaryDirectory(),
      gitHubVersion.type,
      codeQLDefaultVersionInfo,
+      undefined, // rawLanguages: currently, setup-codeql is not language aware


This could become an issue down the line if we ever want to expand on setup-codeql and allow it to run before init. It might be worth documenting the implications of this for overlay analysis here.

Co-authored-by: Michael B. Gale <mbg@github.com>

henrymercer · 2026-05-08T18:27:00Z

I don't feel strongly about this, but the thought crossed my mind whether a reasonable thing to do here to reduce complexity / risk would be to consider breaking this PR up into three separate changes:

The first commit that modifies some of the existing logic to return all FF-enabled CLI versions, but doesn't make use of this information. We can ship that change without a new FF.

I have a mild preference against splitting this PR up further now, but I agree it would have been better to introduce just the "dry run" changes in a separate PR. Happy to factor this into a separate PR if you have a strong preference.

I would prefer to keep the "dry run" changes behind a feature flag though — we're making an extra API request to list the caches, and even though it's not much code, it's safer to roll it out with a flag.

mbg

I have a mild preference against splitting this PR up further now, but I agree it would have been better to introduce just the "dry run" changes in a separate PR. Happy to factor this into a separate PR if you have a strong preference.

It's fine to keep it as-is. I don't feel strongly enough about this to push for splitting it up.

Otherwise the changes look good, thanks for addressing my feedback! I have left a few minor comments.

mbg · 2026-05-12T14:16:00Z

+      `github/codeql-action/init` and `github/codeql-action/analyze` invocations. If specified, the
+      Action may use this list to select a CodeQL CLI version that is best suited to analyzing those
+      languages, for example by preferring a version that has a cached overlay-base database for the


Minor: I'd drop the "If specified, [..], for example by preferring a version that has a cached overlay-base database for the specified languages." part from this description. My thinking here is that:

setup-codeql is mainly about installing the CLI.

So the main thing anyone using this should care about is the tools input.

If tools is provided, I'd expect us to honour it.

If it is unspecified, then languages may affect the CLI version we choose, but so do other things.

Therefore, I am not sure it's worth pointing out here explicitly.

I think this is mostly relevant to us as we develop default setup; indeed, it seems unlikely anyone would want to use setup-codeql in an advanced setup. I'm leaning towards keeping it to explain why it's there when setup-codeql is otherwise language agnostic.
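
The precedence discussed in this thread can be sketched as follows, assuming the `tools` input always wins and `languages` is only consulted when `tools` is unspecified. The `CliChoice` type and the `chooseCliSource` function are hypothetical names used for illustration and do not come from the PR.

```typescript
// Hypothetical result type: where the CLI version comes from.
type CliChoice =
  | { source: "tools-input"; tools: string }
  | { source: "default"; languages: string[] };

/**
 * Decides whether to honour an explicit `tools` input or fall back to
 * default version resolution, which may take the languages into account.
 */
function chooseCliSource(
  toolsInput: string | undefined,
  languages: string[] | undefined,
): CliChoice {
  const tools = toolsInput?.trim();
  if (tools) {
    // An explicit `tools` value is honoured; languages are ignored.
    return { source: "tools-input", tools };
  }
  // No explicit tools: languages become one of several inputs to the choice.
  return { source: "default", languages: languages ?? [] };
}
```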

mbg · 2026-05-12T14:21:01Z

  return languages;
 }

+/** Parses the `languages` input into a list of languages without checking if they are supported by CodeQL. */


Minor: Really all this does is split a comma-separated string into an array, remove excess space characters, convert the strings to lower-case, and remove empty elements. The current description might suggest that something more specific to languages is happening here. How about:

Splits a comma-separated string into an array. Excess spaces are removed and all characters are converted to lower-case.

That's a little implementation specific. Perhaps the main objection is to "Parses"? I'll update that.
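
For reference, the behaviour described in this thread can be sketched as below. The function name `splitLanguagesInput` is a placeholder rather than the name used in the PR, and the sketch only covers the four steps listed in the review comment (split, trim, lower-case, drop empty elements).

```typescript
/**
 * Splits a comma-separated input into an array of lower-case entries.
 * No check is made that the entries are languages supported by CodeQL.
 */
function splitLanguagesInput(input: string | undefined): string[] {
  return (input ?? "")
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);
}
```

An unset or empty input yields an empty array in this sketch, so callers do not need a separate check for a missing value.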

mbg · 2026-05-12T14:27:16Z

+
+      Available options are the same as for the `analysis-kinds` input on the `init` Action.
+    default: 'code-scanning'
+    required: true


required: false?

Since this input has a default, we're disallowing specifying analysis-kinds: null. I don't see why we'd want to allow that, hence disallowing it.

mbg · 2026-05-12T14:33:02Z

    // version, or the one enabled by FFs.
    const versionInfo = useFeaturesToDetermineCLI
-      ? await getCliVersionFromFeatures(features)
+      ? (await getCliVersionFromFeatures(features)).enabledVersions[0]


That solution is OK, but a concern here is that it takes the setup-codeql action a step away from an action that just handles installing the CLI in the sense that it now needs to be aware of what is going to be analysed (if anything). I think the changes are OK though because:

The languages and analysis-kinds inputs are optional (ish - analysis-kinds has a default).

We are still just installing the CLI, but for overlay-enabled Code Scanning analyses the new logic is unavoidable and that requires the information about the (expected) analysis.

henrymercer added 5 commits May 6, 2026 15:14

Add OverlayAnalysisMatchCodeqlVersion feature flag

a796e3e

Expose all enabled default CLI versions

b094211

Match CLI version to cached overlay-base database

55d6319

Add dry run mode so we can dark ship

b967fdf

Add changelog note

1b56327

Copilot AI review requested due to automatic review settings May 6, 2026 18:18

henrymercer requested a review from a team as a code owner May 6, 2026 18:18

Copilot started reviewing on behalf of henrymercer May 6, 2026 18:19

Merge branch 'main' into henrymercer/overlay-match-codeql-version

817b684

github-actions bot added the size/XL label (May be very hard to review) May 6, 2026

Copilot AI reviewed May 6, 2026

henrymercer added 2 commits May 7, 2026 11:00

Filter to code scanning only

01bc9be

Nit: Dedupe languages

7525c68

mbg reviewed May 7, 2026

henrymercer and others added 9 commits May 7, 2026 18:44

Improve changelog note

efc9b0a

Co-authored-by: Michael B. Gale <mbg@github.com>

Merge branch 'main' into henrymercer/overlay-match-codeql-version

0aedbb7

Minor: Introduce constant to avoid duplication

4f815a6

Enable overlay-aware version selection in setup-codeql

2a950b9

Add JSDoc for getRawLanguagesNoAutodetect

9a85234

Remove makeOverlayMatchFeatures indirection

540699d

Remove dead code

42d7f62

Improve error message

87ac48d

Improve tests

b4ea7aa

henrymercer force-pushed the henrymercer/overlay-match-codeql-version branch from 666a561 to b4ea7aa Compare May 8, 2026 18:20

henrymercer requested a review from mbg May 8, 2026 18:20

Use overlay-aware version for code scanning exclusively

201a96b

mbg previously approved these changes May 12, 2026

Nit: Tweak JSDoc for getRawLanguagesNoAutodetect

8d21760

henrymercer dismissed mbg’s stale review via 8d21760 May 12, 2026 15:22
